Abstract

Mosquitoes are hosts of several Spiroplasma species that belong to different serogroups. To investigate the genetic mechanisms that 
may be involved in the utilization of similar hosts in these phylogenetically distinct bacteria, we determined the complete genome 
sequences of Spiroplasma diminutum and S. taiwanense for comparative analysis. The genome alignment indicates that their
chromosomal organization is highly conserved, which is in sharp contrast to the elevated genome instabilities observed in other 
Spiroplasma lineages. Examination of the substrate utilization strategies revealed that S. diminutum can use a wide range of carbo- 
hydrates, suggesting that it is well suited to living in the gut (and possibly the circulatory system) of its mosquito hosts. In comparison, 
S. taiwanense has lost several carbohydrate utilization genes and acquired additional sets of oligopeptide transporter genes through 
tandem duplications, suggesting that proteins from digested blood meal or lysed host cells may be an important nutrient source. 
Moreover, one glycerol-3-phosphate oxidase gene (glpO) was found in S. taiwanense but not S. diminutum. This gene is linked to the
production of reactive oxygen species and has been shown to be a major virulence factor in Mycoplasma mycoides. This finding may 
explain the pathogenicity of S. taiwanense observed in previous artificial infection experiments, while no apparent effect was found 
for S. diminutum. To infer the gene content evolution at deeper divergence levels, we incorporated other Mollicutes genomes for
comparative analyses. The results suggest that the losses of biosynthetic pathways are a recurrent theme in these host-associated 
bacteria. 
Introduction

The complete genome sequence of an organism provides 
biologists with the opportunity to examine the presence or 
absence of certain genes that may explain its phenotype. 
For this reason, comparative analysis of genomes between 
related organisms with phenotypic differences is a powerful 
tool to investigate the underlying genetic mechanisms. In this 
work, we chose two mosquito-associated bacteria in the 
genus Spiroplasma as the study system and utilized a compar- 
ative genomics approach to infer their metabolic differentia- 
tions and gene content evolution. 



Taxonomically, the genus Spiroplasma is described as a 
group of helical, motile, and wall-less bacteria in the class 
Mollicutes (Whitcomb 1981; Gasparich et al. 2004; Regassa 
and Gasparich 2006; Gasparich 2010). Similar to other 
members of this class, such as the vertebrate-pathogenic 
Mycoplasma and the plant-pathogenic Candidatus 
Phytoplasma, all characterized Spiroplasma species are 
found to be associated with eukaryotic hosts. Most com- 
monly, spiroplasmas are associated with insects, such as var- 
ious flies and mosquitoes in the order Diptera or various 
beetles in the order Coleoptera (Hackett et al. 1992; 



© The Author(s) 2013. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. 

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits
non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com



Genome Biol. Evol. 5(8):1512-1523. doi:10.1093/gbe/evt108 Advance Access publication July 19, 2013



Complete Genome Sequences of Spiroplasma diminutum and S. taiwanense



Gasparich et al. 2004). Although most of these insect-associ- 
ated spiroplasmas are not known to have any apparent effect 
on their hosts (Gasparich 2010), a small number of 
Spiroplasma lineages have been found to be either beneficial 
or pathogenic. For example, several uncultivated spiroplasmas 
can provide protection against parasitic nematodes (Jaenike 
et al. 2010), parasitoid wasps (Xie et al. 2010, 2011), or fungal
pathogens (Lukasik et al. 2013) in their Drosophila or aphid
hosts. Alternatively, notable examples of harmful spiroplasmas
include the honeybee-pathogenic Spiroplasma melliferum
(Clark et al. 1985) and S. apis (Mouches et al. 1983), the
male-killing spiroplasmas in Drosophila and other insects
(Williamson et al. 1999; Hurst and Jiggins 2000; Anbutsu
and Fukatsu 2003; Tabata et al. 2011), and the
mosquito-pathogenic S. culicicola and S. taiwanense (Humphery-Smith
et al. 1991a, 1991b; Vazeille-Falcoz et al. 1994; Phillips and 
Humphery-Smith 1995). Because of their insect pathogenicity 
and relatively high host specificity, these spiroplasmas may be 
developed into biocontrol agents for insect pests (Anbutsu 
and Fukatsu 2011). 

For biological control of insect pests, much attention has been given to mosquitoes because of the
public health concerns (Federici et al. 2003). To date, four Spiroplasma species have been isolated
from mosquitoes, including S. culicicola from the salt marsh mosquito Aedes sollicitans collected in
New Jersey, USA (Hung et al. 1987), S. sabaudiense from a mixed pool of A. sticticus and A. vexans
collected in the French Northern Alps (Abalain-Colloc et al. 1987), and two species from mosquitoes
collected in Taiwan: S. taiwanense from Culex tritaeniorhynchus (Abalain-Colloc et al. 1988) and
S. diminutum from C. annulus and C. tritaeniorhynchus (Williamson et al. 1996). Interestingly,
artificial infection experiments revealed that these Spiroplasma species exhibit different levels of
pathogenicity toward their mosquito hosts. While S. diminutum can replicate inside A. albopictus,
the infection does not reduce the host lifespan (Vorms-Le Morvan et al. 1991). In contrast,
infection of the yellow fever mosquito A. aegypti by S. taiwanense significantly reduces the
survival of larvae (Humphery-Smith et al. 1991a) and the lifespan of adult females
(Humphery-Smith et al. 1991b; Vazeille-Falcoz et al. 1994). A histopathological study that used
Anopheles stephensi as the host has shown that S. taiwanense can replicate both extra- and
intra-cellularly in the host hemolymph, hemocytes, thoracic flight muscles, neural system, and
other tissues (Phillips and Humphery-Smith 1995). Moreover, the infected mosquitoes exhibit loss of
flight ability and reduced mobility, which are linked to extensive cell lysis and polysaccharide
depletion in the thoracic flight muscles. Finally, cytadsorption of S. taiwanense was associated
with the swelling and subsequent lysis of A. albopictus C6/36 cells in vitro (Chastel and
Humphery-Smith 1991).

To investigate the genetic mechanisms that may explain the differences in pathogenicity toward
their mosquito hosts in previous artificial infection experiments, we determined the complete
genome sequences of S. diminutum and S. taiwanense in this study for comparative analysis. In
addition to providing candidate genes for future characterization of virulence factors,
comparisons with other available genome sequences, such as the honeybee-pathogenic S. melliferum
(Alexeev et al. 2012; Lo et al. 2013) and the vertebrate-pathogenic Mycoplasma species
(Sasaki et al. 2002; Thiaucourt et al. 2011), can further improve our understanding of
genome evolution in these host-associated bacteria.

Materials and Methods 

Molecular Phylogenetic Inference 

To infer the evolutionary relationship among the Spiroplasma 
lineages of interest, we used 16S rDNA and DNA-directed 
RNA polymerase subunit beta (rpoB) to construct a molecular 
phylogeny. The sequences were obtained from the NCBI 
nucleotide database (Benson et al. 2012) and the correspond- 
ing accession numbers are provided in supplementary table 
S1, Supplementary Material online. These two genes were 
aligned separately using MUSCLE v3.8 (Edgar 2004) with 
the default settings and concatenated into a single dataset 
with 6,585 aligned nucleotide sites. A maximum likelihood 
phylogeny was inferred using PhyML v3.0 (Guindon and 
Gascuel 2003) with the GTR + I + G model and six substitution 
rate categories. To estimate the levels of clade support, we 
generated 1,000 nonparametric bootstrap samples using the 
SEQBOOT program of PHYLIP v3.69 (Felsenstein 1989). For the species in the Apis clade,
including the four mosquito-associated Spiroplasma species, we compiled host association
information from the literature; a summary is provided in supplementary table S2,
Supplementary Material online.
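
As an illustration of the concatenation step described above, the following is a minimal Python sketch, not the procedure actually used in this study. It assumes each per-gene alignment (e.g., 16S rDNA and rpoB) is already available as a dictionary mapping a taxon name to its aligned sequence; the function name `concatenate_alignments` and the gap-padding of taxa missing from one gene are assumptions made for the example.

```python
def concatenate_alignments(alignments, gap="-"):
    """Join per-gene alignments into one concatenated alignment.

    alignments: list of dicts, each mapping taxon name -> aligned sequence.
    Taxa missing from a gene are padded with gap characters (an assumption
    of this sketch; the source does not say how missing genes were handled).
    """
    taxa = sorted(set().union(*[aln.keys() for aln in alignments]))
    parts = {taxon: [] for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("all sequences in one alignment must have equal length")
        width = lengths.pop()
        for taxon in taxa:
            parts[taxon].append(aln.get(taxon, gap * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Toy example with two short "genes" and three taxa.
gene_16s = {"A": "AC-G", "B": "ACTG"}
gene_rpob = {"A": "TT", "C": "TA"}
supermatrix = concatenate_alignments([gene_16s, gene_rpob])
```

Every taxon in the result has the same total length, which is the property a phylogenetic program expects from a concatenated dataset.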

Strain Source and DNA Preparation 

The two focal bacterial strains, S. diminutum CUAS-1T (ATCC 49235) and S. taiwanense CT-1T
(ATCC 43302), were obtained from the American Type Culture Collection (ATCC). The freeze-dried
culture samples were processed according to the protocol provided by ATCC. Briefly, the samples
were rehydrated by adding 5 ml ATCC 988 medium, titrated by serial dilution, and incubated at
30 °C without shaking until the medium turned yellow. The minimum concentration that showed
spiroplasma growth was then transferred into R2 medium (Moulder et al. 2002) for DNA extraction
using the Wizard Genomic DNA Purification Kit (Promega, USA). For each DNA sample, we amplified
the 16S rDNA using the primer pair 8F (5′-agagtttgatcctggctcag-3′) (Turner et al. 1999) and
1492R (5′-ggttaccttgttacgactt-3′) (Ochman et al. 2010) for Sanger sequencing to confirm the
sample identity and the absence of contamination.
Genome Sequencing and Assembly

To determine the genome sequences of S. diminutum and S. taiwanense, we used a commercial service
provider (Yourgene Bioscience, Taipei, Taiwan) for whole-genome shotgun sequencing with 101-bp
reads produced on the Illumina HiSeq 2000 platform (Illumina, USA). The procedure for de novo
assembly was based on that described previously (Chung et al. 2013; Lo et al. 2013). Briefly, raw
reads were quality-trimmed and filtered based on usable length. The resulting high-quality reads
were used as the input for the assembler of choice to produce draft assemblies (more details
below). Subsequently, the draft assemblies were improved using an iterative procedure until the
chromosomes and plasmids were sequenced to completion. For each iteration, we mapped all raw
reads to the existing scaffolds using BWA v0.6.2 (Li and Durbin 2009) and visualized the results
with IGV v2.1.24 (Robinson et al. 2011). Paired reads that extended the existing contigs or
supported the linkage between contigs were used to improve the assembly. The MPILEUP program in
the SAMTOOLS v0.1.18 package (Li et al. 2009) was used to identify polymorphic sites. Primer
walking and additional Sanger sequencing were used to fill the gaps and to verify the assembly.

For S. taiwanense, we utilized one paired-end library (insert size = 192 bp, 47,312,605 read
pairs, approximately 9.6 Gb of raw data). The initial de novo assembly was performed using
VELVET v1.2.07 (Zerbino and Birney 2008) with the parameters k-mer, expected coverage, and
coverage cutoff set to 89, 1200, and 100, respectively. For S. diminutum, we utilized one
paired-end library (insert size = 178 bp, 44,436,475 read pairs, approximately 9.0 Gb of raw
data) and one mate-pair library (insert size = ~4.1 kb, 18,273,021 read pairs, approximately
3.7 Gb of raw data). The initial de novo assembly was performed using ALLPATHS-LG release 42781
(Gnerre et al. 2011) to take advantage of the availability of the mate-pair library. A subset of
raw reads was randomly selected from each library to represent ~50× coverage for the initial
draft assembly, as suggested by the assembler documentation.
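
The data volumes quoted above follow from the read counts and the 101-bp read length, and the same arithmetic gives the fraction of reads needed for a target coverage. The sketch below is illustrative only: the helper names are hypothetical, the genome size used is the final S. taiwanense chromosome length reported in the Results (which would not have been known before assembly), and the way the reads were actually subsampled is not described in the text.

```python
READ_LENGTH = 101  # bp per read, as stated in the text


def raw_bases(read_pairs, read_length=READ_LENGTH):
    """Total sequenced bases for a paired library (two reads per pair)."""
    return read_pairs * 2 * read_length


def fold_coverage(total_bases, genome_size):
    """Average depth of coverage for a genome of the given size (bp)."""
    return total_bases / genome_size


def subsample_fraction(total_bases, genome_size, target_coverage=50.0):
    """Fraction of reads to keep so that coverage is about target_coverage."""
    return min(1.0, target_coverage * genome_size / total_bases)


# S. taiwanense paired-end library: 47,312,605 read pairs.
bases = raw_bases(47_312_605)              # 9,557,146,210 bp, i.e. about 9.6 Gb
depth = fold_coverage(bases, 1_075_140)    # several thousand-fold coverage
keep = subsample_fraction(bases, 1_075_140, 50.0)
```

Under these assumptions well under 1% of the reads would suffice for ~50× coverage, which is why a random subset rather than the full library is a sensible input for a draft assembly.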

Annotation and Comparative Analysis 

The procedures for genome annotation and comparative analysis were based on those described
previously (Ku et al. 2013; Lo et al. 2013). The complete genome sequences were processed using
RNAmmer (Lagesen et al. 2007), tRNAscan-SE (Lowe and Eddy 1997), and PRODIGAL (Hyatt et al. 2010)
for gene predictions. The protein-coding genes were annotated based on the single-copy
orthologous genes in the S. melliferum IPMB4A genome (Lo et al. 2013) identified by OrthoMCL
(Li et al. 2003) with a BLASTP (Altschul et al. 1997; Camacho et al. 2009) e-value cutoff of
1 × 10^-15. The protein-coding genes that did not have a single-copy ortholog in the
S. melliferum IPMB4A genome were manually curated based on the top 20 hits of BLASTP sequence
similarity searches against the NCBI nonredundant protein (nr) database (Benson et al. 2012). The
functional classification of protein-coding genes was inferred using the KAAS tool
(Moriya et al. 2007) provided by the KEGG database (Kanehisa and Goto 2000; Kanehisa et al. 2010).
The KEGG orthology assignment was further mapped to the COG functional categories
(Tatusov et al. 1997, 2003). Genes that lacked COG assignment were assigned to a custom category
(category X). The annotated chromosomes were plotted using CIRCOS (Krzywinski et al. 2009) for
the visualization of gene locations, GC-skew, and GC content.
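
The GC content and GC skew tracks mentioned above are typically computed in windows along the chromosome. The following Python sketch shows one common way to do this; the window size, the step, and the skew definition (G - C)/(G + C) are assumptions for illustration, as the text does not state the settings used for the plots.

```python
def gc_stats(seq, window=1000, step=1000):
    """Return (start, gc_content, gc_skew) for each full window of seq.

    gc_content = (G + C) / window length
    gc_skew    = (G - C) / (G + C), set to 0.0 when the window has no G or C
    Window and step sizes are illustrative defaults, not values from the paper.
    """
    seq = seq.upper()
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        gc_content = (g + c) / window
        gc_skew = (g - c) / (g + c) if (g + c) else 0.0
        results.append((start, gc_content, gc_skew))
    return results


# Toy sequence: one G-rich window followed by one AT-only window.
tracks = gc_stats("GGGCAAAA", window=4, step=4)
```

In an AT-rich genome such as these, a local peak in the GC content track is a simple way to spot the rRNA gene cluster, while a sign change in GC skew is commonly used to locate the replication origin and terminus.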

To compare the chromosomal organization between different Spiroplasma species, we utilized
MAUVE v2.3.1 (Darling et al. 2010) for genome alignment. To estimate the genome-wide nucleotide
sequence divergence level, we identified the single-copy orthologs in each genome pair using
OrthoMCL (Li et al. 2003) with a BLASTN (Altschul et al. 1997; Camacho et al. 2009) e-value
cutoff of 1 × 10^-15. The corresponding sequences were aligned using MUSCLE v3.8 (Edgar 2004)
with the default settings and concatenated into a single alignment for each pair. The DNADIST
program of PHYLIP v3.69 (Felsenstein 1989) was used to calculate the sequence identity.
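
For readers who want to see what a genome-wide identity value amounts to, the sketch below computes the uncorrected proportion of identical sites between two aligned sequences. It is a stand-in for illustration, not a reimplementation of DNADIST: the function name is hypothetical, and excluding any column with a gap in either sequence is an assumption of this example.

```python
def pairwise_identity(seq_a, seq_b, gap="-"):
    """Proportion of identical sites between two aligned sequences.

    Columns with a gap in either sequence are skipped (an assumption of
    this sketch). Raises ValueError if the inputs are not the same length
    or share no ungapped columns.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return matches / compared


# Five ungapped columns, four of them identical.
identity = pairwise_identity("ACGT-A", "ACTTAA")
```

Applied to a concatenated alignment of single-copy orthologs, a function of this kind yields a single percentage per genome pair, comparable in spirit to the values reported in the Results.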

For the gene content comparison with honeybee-associated S. melliferum, we merged the two draft
genomes available for this species (Alexeev et al. 2012; Lo et al. 2013) into a pan-genome to
better represent its gene repertoire. For the comparison with other Mollicutes lineages, we
selected Mycoplasma mycoides subsp. capri LC str. 95010 (GenBank accession number NC_015431)
(Thiaucourt et al. 2011) and Mesoplasma florum L1 (NC_006055) to represent the
Mycoides-Entomoplasmataceae clade, which is the sister group to the Apis clade that contains
S. diminutum and S. taiwanense (Gasparich et al. 2004). Additionally, M. penetrans HF-2
(NC_004432) (Sasaki et al. 2002) was used as the outgroup for this comparison because it has the
highest number of protein-coding genes among the Mycoplasma species with complete genome
sequences available. For these gene content comparisons, the homologous gene clusters were
identified using OrthoMCL (Li et al. 2003) with a BLASTP (Altschul et al. 1997;
Camacho et al. 2009) e-value cutoff of 1 × 10^-15. The 259 homologous gene clusters that contain
exactly one orthologous gene from each of the species compared were used to infer a species
phylogeny. The concatenated alignment contains 104,376 aligned amino acid sites and was used for
PhyML analysis with the LG substitution model (Le and Gascuel 2008). The clade supports were
inferred by using 1,000 bootstrap samples. After obtaining the species phylogeny, the
phylogenetic distribution pattern of homologous gene clusters was inferred based on the
presence/absence of genes in each of the species compared.
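
The two operations described above, selecting single-copy clusters for phylogenetic inference and tabulating presence/absence patterns, can be sketched as follows. This is an assumed, simplified representation: clusters are given as a dictionary from a cluster identifier to a list of (species, gene) pairs, and the function names are hypothetical rather than part of OrthoMCL.

```python
from collections import Counter


def single_copy_core(clusters, species):
    """IDs of clusters with exactly one gene from every listed species."""
    core = []
    for cluster_id, members in clusters.items():
        per_species = Counter(sp for sp, _gene in members)
        if set(per_species) == set(species) and all(n == 1 for n in per_species.values()):
            core.append(cluster_id)
    return sorted(core)


def phyletic_patterns(clusters, species):
    """Count clusters by presence/absence pattern across the species list."""
    counts = Counter()
    for members in clusters.values():
        present = {sp for sp, _gene in members}
        counts[tuple(sp in present for sp in species)] += 1
    return counts


# Toy input: Sd and St are placeholder species labels.
toy = {
    "c1": [("Sd", "a1"), ("St", "b1")],                 # shared, single copy
    "c2": [("Sd", "a2")],                               # Sd-specific
    "c3": [("Sd", "a3"), ("St", "b3"), ("St", "b4")],   # shared, duplicated in St
}
core = single_copy_core(toy, ["Sd", "St"])
patterns = phyletic_patterns(toy, ["Sd", "St"])
```

The pattern counts are what would be mapped onto the species phylogeny to propose gene gains and losses, while the single-copy core supplies the genes for the concatenated alignment.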






Results and Discussion 

Molecular Phylogeny of Mosquito-Associated 
Spiroplasma Species 

The maximum likelihood phylogeny inferred using the concatenated alignment of 16S rDNA and rpoB
(fig. 1) is mostly congruent with a previous study that used only 16S rDNA and the maximum
parsimony method (Gasparich et al. 2004). The major inconsistencies are the placements of
S. corruscae, S. turonicum, S. litorale, and S. taiwanense. These species were thought to be
sisters of the S. apis-S. montanense clade (Gasparich et al. 2004), but our results provided
alternative placements with low levels of bootstrap support. Because molecular phylogenies
inferred using a limited number of loci are often problematic, additional molecular markers
will be required to resolve these uncertainties.

Despite these uncertainties within the Apis clade, it is clear that the four mosquito-associated
Spiroplasma species are quite divergent. This observation is consistent with the results from
serotyping, which placed S. culicicola, S. diminutum, S. sabaudiense, and S. taiwanense in groups
X, XXV, XIII, and XXII, respectively (Gasparich et al. 2004). Taken together, these results
suggest that the association with mosquito hosts may have evolved independently among these
Spiroplasma species. The comparison between S. diminutum and S. taiwanense is of particular
interest because these two species were both isolated from mosquitoes collected in Taiwan during
1980-1981 and appeared to overlap in their native host range. The three characterized strains of
S. taiwanense (CT-1T, CT-2, and CT-3) were all isolated from C. tritaeniorhynchus
(Abalain-Colloc et al. 1988). The two characterized strains of S. diminutum, CUAS-1T and CT-4,
were isolated from C. annulus and C. tritaeniorhynchus, respectively (Williamson et al. 1996).

Genome Sequences of S. diminutum and S. taiwanense 

The genomes of S. diminutum and S. taiwanense were sequenced to completion in this study
(table 1 and fig. 2). Both genomes contain a circular chromosome that is ~1.0 Mb in size
(S. diminutum: 945,296 bp; S. taiwanense: 1,075,140 bp). The S. taiwanense genome contains a
circular plasmid that is 11,138 bp in size and carries 11 protein-coding genes (1 SOJ-like
protein and 10 hypothetical proteins); no plasmid was found in the S. diminutum genome. The
chromosomal GC contents are consistent with previous estimates obtained using biochemical
methods, with S. diminutum having a GC content of 25.5% (Williamson et al. 1996) and
S. taiwanense having a GC content of 23.9% (Abalain-Colloc et al. 1988). Both genomes contain a
single ribosomal RNA gene cluster, which corresponds to the highest peak observed in the GC
content plot (fig. 2; ~711-716 kb in S. diminutum and ~859-864 kb in S. taiwanense). Both
genomes contain 29 tRNA genes, fewer than those found in S. citri and S. melliferum (table 1).

The genome alignment between S. diminutum and S. taiwanense indicates that their chromosomes are
largely syntenic except for a ~122 kb inversion that encompasses the putative replication
terminus (fig. 3A). This conservation in chromosomal organization was surprising because these
two species are relatively divergent, with an average genome-wide nucleotide sequence identity
of 76.1% (calculated based on 652 single-copy orthologous genes shared between these two
genomes; the concatenated alignment contains a total of 668,307 aligned nucleotide sites). For
comparison, the closely related S. citri and S. melliferum in the Citri clade have an average
genome-wide nucleotide sequence identity of 99.0% (based on 696 genes and 691,679 sites), yet
exhibit extensive rearrangements (fig. 3B). The genome instability in the Citri clade may be
explained by the presence of highly repetitive plectroviral fragments (table 1; Ye et al. 1996;
Ku et al. 2013; Lo et al. 2013).

Despite the similarities described above, close inspections of the S. diminutum-S. taiwanense
comparison reveal several intriguing differences. First, most of the genome-specific regions in
these two species are located near the putative replication terminus (fig. 2), suggesting that
these regions are hotspots for molecular evolution by accelerated sequence divergence or
horizontal gene transfers. Intriguingly, this clustering of species-specific genes was not found
in a comparison between S. chrysopicola and S. syrphidicola (Ku et al. 2013). It is unclear
whether this difference arises because the two species pairs were sampled from different
Spiroplasma clades or because their divergence levels are quite different (i.e., the average
genome-wide nucleotide identity between S. chrysopicola and S. syrphidicola is ~92.2%, which is
much higher than that in the S. diminutum-S. taiwanense comparison). Second, while no pseudogene
was found in the S. diminutum genome, we identified 54 putative pseudogenes with premature stop
codons and/or frameshift indels in S. taiwanense (table 1). These pseudogenes include those
involved in carbohydrate uptake (treB, fruA, celB, nagB, and sgaB), carbohydrate metabolism
(glpX, scrB, bgl, and lacG), and homologous recombination (ruvA and ruvB). Additionally,
S. taiwanense contains many more long intergenic regions (>300 bp) than S. diminutum (87 vs. 32),
which may harbor highly degraded pseudogenes that cannot be easily identified by sequence
similarity searches. This increase in pseudogene numbers is similar to those found in the
genomes of recent or facultative pathogens (Ochman and Davalos 2006). Furthermore, the observed
genome degradations suggest that S. taiwanense may have a smaller effective population size than
S. diminutum, which is consistent with the field isolation records that S. taiwanense has a
narrower natural host range (Abalain-Colloc et al. 1988; Williamson et al. 1996). A smaller
effective population size would result in elevated levels of genetic drift and increased
accumulation of slightly deleterious mutations (Kuo et al. 2009; Kuo and Ochman 2009, 2010).
Interestingly, a similar pattern of genome degradation is also observed in the pathogenic
S. citri and S. melliferum (table 1), both of which have lost the recombinase A gene (recA) that
is required for DNA repair by homologous recombination (Marais et al. 1996; Carle et al. 2010;
Alexeev et al. 2012; Lo et al. 2013). In contrast, these DNA repair-related genes (e.g., recA,
ruvA, and ruvB) are still intact in the S. diminutum genome, which may explain why this genome
has the lowest incidence of pseudogenes and the highest coding density among the Spiroplasma
genomes reported to date (Carle et al. 2010; Alexeev et al. 2012; Lo et al. 2013; Ku et al. 2013).
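
Counting long intergenic regions, as in the 87 vs. 32 comparison above, reduces to finding gaps between consecutive annotated genes. The Python sketch below is a simplified illustration under stated assumptions: gene coordinates are 1-based inclusive (start, end) pairs, strand is ignored, and the chromosome is treated as linear, so any gap spanning the origin of a circular chromosome is not reported. The function name is hypothetical.

```python
def long_intergenic_regions(genes, min_length=300):
    """Return (start, end) of gaps between genes longer than min_length bp.

    genes: iterable of (start, end) pairs, 1-based inclusive coordinates.
    Overlapping or nested genes are handled by tracking the furthest end
    seen so far. The chromosome is treated as linear for simplicity.
    """
    ordered = sorted(genes)
    if not ordered:
        return []
    gaps = []
    furthest_end = ordered[0][1]
    for start, end in ordered[1:]:
        gap_length = start - furthest_end - 1
        if gap_length > min_length:
            gaps.append((furthest_end + 1, start - 1))
        furthest_end = max(furthest_end, end)
    return gaps


# Toy annotation: only the 399-bp gap between positions 501 and 899 qualifies.
toy_genes = [(1, 100), (150, 500), (900, 1200), (400, 450)]
regions = long_intergenic_regions(toy_genes)
```

The threshold is strict (greater than 300 bp), matching the ">300 bp" wording in the text; whether the original count treated the threshold or overlapping features in exactly this way is not stated.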

Comparison of Substrate Utilization Strategies 

To investigate the genetic mechanisms that may be involved in utilizing mosquito hosts and the
possible explanations of differences in the pathogenicity inferred from previous artificial
infection experiments (Chastel and Humphery-Smith 1991; Humphery-Smith et al. 1991a, 1991b;
Vorms-Le Morvan et al. 1991; Vazeille-Falcoz et al. 1994; Phillips and Humphery-Smith 1995), we
compared the substrate utilization strategies of S. diminutum and S. taiwanense based on their
annotated transporters and metabolic enzymes (fig. 4). The results indicate that both species
are capable of importing and utilizing glucose, fructose, and N-acetylglucosamine (GlcNAc).
However, the genes involved in the utilization of trehalose (treA and treB), cellobiose (celB),
sucrose (scrB and scrK), and N-acetylmuramic acid (MurNAc; murP and murQ) are found in
S. diminutum but not S. taiwanense. Among these substrates, cellobiose and MurNAc may be derived
from algae and bacteria that are consumed by mosquito larvae, sucrose is the major carbohydrate
in nectar and plant sap consumed by adult mosquitoes, and trehalose is the most abundant sugar
in insect hemolymph (Becker et al. 1996; Blatt and Roces 2001). The flexible sugar usage
capacity suggests that S. diminutum is well suited to the environment in the mosquito gut and
may be capable of living in the host circulatory system as well.

Most of these S. diminutum-specific genes appear to have been lost in the S. taiwanense genome
through pseudogenization (see above). The loss of trehalose utilization genes (treA and treB)
suggests that S. taiwanense may face limited carbohydrate supplies in host hemolymph, which is
consistent with the observation that S. taiwanense cells often display postexponential
morphologies in the hemolymph of infected An. stephensi (Phillips and Humphery-Smith 1995).
Intriguingly, we found that the S. taiwanense genome encodes a copy of glycerol-3-phosphate
oxidase (glpO), which can be used to produce hydrogen peroxide (H2O2) and reactive oxygen
species (ROS). This gene has been shown to be a major virulence factor that causes host tissue
inflammation and cell death in M. mycoides (Pilo et al. 2005, 2007) and may contribute to the
tissue damage (Phillips and Humphery-Smith 1995) and higher mortality rates
(Humphery-Smith et al. 1991a, 1991b; Vazeille-Falcoz et al. 1994) observed in
S. taiwanense-infected mosquitoes. It will be interesting to examine the timing and
tissue-specificity of glpO activation and to investigate the link to stress responses in future
empirical studies.

Fig. 2. Genome maps of Spiroplasma diminutum and S. taiwanense. Rings from the outside in: (1) scale marks; (2) protein-coding genes on the
forward strand; (3) protein-coding genes on the reverse strand (color-coded by the functional categories); (4) rRNA (purple) and tRNA genes (green);
(5) pseudogenes (orange) and intergenic regions >300 bp (black); (6) species-specific regions identified in the pairwise comparison between S. diminutum
(blue) and S. taiwanense (red); (7) GC skew; and (8) GC content.

In contrast to the deficiencies in carbohydrate utilization, S. taiwanense may be more efficient
in oligopeptide uptake compared with S. diminutum. The gene cluster that encodes oligopeptide
ABC transporters appears to have experienced tandem duplications and exists in three copies on
the S. taiwanense chromosome (~820-847 kb). In addition to the lysed host cells, the digested
blood meal in the gut of female mosquitoes can provide abundant substrates for these
transporters. Taken together, although S. diminutum and S. taiwanense are both associated with
Culex mosquitoes in Southeast Asia (Abalain-Colloc et al. 1988; Williamson et al. 1996), their
substrate utilization strategies for exploiting these closely related hosts appear to be quite
different.

Gene Content Comparison with the Honeybee-Associated S. melliferum

Two previously published genome sequences of the honeybee-associated S. melliferum
(Alexeev et al. 2012; Lo et al. 2013) provide an opportunity for comparative analysis of gene
content between two major clades of Spiroplasma (fig. 1). A three-way comparison among
S. melliferum, S. diminutum, and S. taiwanense revealed that these species shared a total of 472
homologous gene clusters (fig. 5 and supplementary table S3, Supplementary Material online). In
addition to the essential genes conserved across all bacterial genomes such as those involved in
DNA replication, transcription, translation, and other fundamental cell processes (Koonin 2003;
Lapierre and Gogarten 2009; Chen et al. 2012), we found that these spiroplasmas all have the
glycolysis pathway to convert phosphorylated sugars into pyruvate for energy generation, the
nonmevalonate pathway (dxs, dxr, ispD, ispF, ispG, and ispH) to synthesize isopentenyl
pyrophosphate (IPP) for terpenoid backbone, and oligopeptide ABC transporters (oppA, oppB, oppC,
oppD, and oppF) to import oligopeptides as a source of amino acids for protein synthesis.
Furthermore, these genomes contain the genes required for nucleotide biosynthesis from
nucleobases (adenine, guanine, uracil, and xanthine) and a nucleoside (thymidine). The presence
of these genes is in agreement with the previous findings that spiroplasmas have more flexible
metabolic capabilities compared with mycoplasmas and phytoplasmas (Carle et al. 2010;
Chen et al. 2012; Lo et al. 2013), which may contribute to their lower degree of host
dependence.

Other than the metabolic genes and transporters described above, these insect-associated
spiroplasmas shared several genes related to oxidative stress resistance such as those involved
in iron-sulfur (Fe-S) cluster synthesis (sufS, sufU, sufB, sufC, and sufD). The organization of
this suf operon is conserved within Spiroplasma and other Gram-positive bacteria, while distinct
from those found in Gram-negative bacteria (Riboldi et al. 2009). Additionally, these
spiroplasmas all have the thiol peroxidase (tpx), which has been shown to be important in
protecting Enterococcus faecalis cells inside mouse macrophages (La Carbona et al. 2007). Taken
together, these genes may protect these insect-associated bacteria against the reactive oxygen
intermediates generated by the host immune system (Cerenius et al. 2008).

In terms of species-specific gene clusters, S. melliferum has the highest number compared with
S. diminutum and S. taiwanense (435, 134, and 281, respectively). While most of these
species-specific genes are annotated as hypothetical proteins with unknown functions, some have
more detailed annotation for inferring the functional significance. For example, S. melliferum
has the entire gene set for arginine catabolism (arcA, arcB, and arcC), which is consistent with
the biochemical assay results that this species can hydrolyze arginine (Clark et al. 1985)
whereas S. diminutum and S. taiwanense cannot (Abalain-Colloc et al. 1988;
Williamson et al. 1996). This ability for arginine hydrolysis can contribute to energy
generation and provide organic nitrogen, which allows for more flexible metabolisms and may
promote cell growth when other energy sources are limited (Pereyre et al. 2009). Moreover,
S. melliferum has the gene set for uridine monophosphate (UMP) synthesis (pyrB, pyrC, pyrD,
pyrE, and pyrF), which may reduce its dependence on the host for nucleotides. Finally, a large
number of S. melliferum-specific genes originated from plectroviral invasion of this genome and
the associated horizontal gene transfer (Alexeev et al. 2012; Lo et al. 2013).

Fig. 4. Sugar uptake and utilization. Comparison of the phosphotransferase system (PTS)
transporters and enzymes involved in sugar uptake and utilization between S. diminutum and
S. taiwanense. Gene names are color-coded according to their patterns of presence/absence
(gray: shared; blue: S. diminutum-specific; red: S. taiwanense-specific). DHAP,
dihydroxyacetone phosphate; G3P, glycerol 3-phosphate; GlcNAc, N-acetylglucosamine; MurNAc,
N-acetylmuramic acid; ROS, reactive oxygen species.

One important finding from this among-species compari- 
son is related to the variable patterns of carbohydrate uptake 
and utilization. Extending the results from the S. diminutum- 
5. taiwanense comparison as discussed above, we found that 
the phosphotransferase system (PTS) transporters for import- 
ing glucose and fructose appear to be conserved among the 
spiroplasmas characterized to date. Although the PTS trans- 
porter for importing GlcNAc (nagE) is shared by these three 
species, it was not found in the draft genome assembly of the 
phytopathogenic S. citri (Carle et al. 2010; Lo et al. 2013). It is
not clear whether the absence of this gene in S. citri was due
to true loss or the incompleteness of its draft genome assem-
bly. The pattern for sucrose uptake is unclear for the same
reason: although the corresponding gene (scrA) was not
found in either S. melliferum or S. citri, this gene may
reside in the unassembled parts of these two genomes.
Nonetheless, the availability of the complete genome 
sequence of S. taiwanense suggests that the ability to utilize
trehalose, cellobiose, and MurNAc is dispensable. 

Comparison with the Mycoides-Entomoplasmataceae 
Clade and Inference of Gene Content Evolution 

The genus Spiroplasma is known to be a paraphyletic group 
with the Mycoides-Entomoplasmataceae clade (containing 
M. mycoides and other nonhelical species assigned to the 
genera Mesoplasma and Entomoplasma) as its descendants 
(Gasparich et al. 2004). Because the Apis clade (containing the 
S. diminutum and S. taiwanense reported in this study) is the
sister group to the Mycoides-Entomoplasmataceae clade 
(fig. 1), the availability of these two new genome sequences 
provides an opportunity to infer the gene content evolution 
among these bacteria. 

To investigate this question, we identified 259 single-copy 
genes shared among selected Mollicutes genomes for phylo- 
genetic inference. The organismal phylogeny inferred from 
the concatenated alignment based on the maximum likeli-
hood method received 100% bootstrap support on all internal
branches (fig. 6) and is consistent with our current under- 
standing of Mollicutes evolution (Gasparich et al. 2004). 
Using this phylogeny as the framework, we inferred putative 
events of gene gains and losses based on the pattern of gene 
presence and absence in each of the genomes compared (fig. 6
and supplementary table S4, Supplementary Material online). 
Although it is reasonable to hypothesize that some of the 
putative gene gains may have contributed to important func- 
tions, such inference was difficult because most of the line- 
age-specific genes are annotated as hypothetical proteins 
without functional description. Rather, the main finding 
from this analysis is that losses of biosynthetic pathways 
appear to be a recurrent theme among these host-associated 
bacteria (Ochman and Davalos 2006; McCutcheon and 
Moran 2011). For example, the genes involved in arginine 
catabolism and UMP synthesis as described above appear to 
have been lost in the common ancestor of the Apis and 
Mycoides-Entomoplasmataceae clades. Moreover, the genes 
involved in the synthesis of IPP and Fe-S cluster appear to have 
been lost in the common ancestor of the Mycoides- 
Entomoplasmataceae clade. 
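
As an illustration of how gain and loss events can be placed on a phylogeny from a presence/absence pattern, the sketch below applies Fitch parsimony to a single gene on a rooted binary tree. The exact inference procedure is not restated in this section, so the choice of Fitch parsimony, the rule of treating an ambiguous root as "absent", the toy tree, the leaf labels, and the function name infer_events are all assumptions made for this example rather than a description of the analysis in fig. 6.

```python
def infer_events(tree, presence):
    """Infer gain/loss events for one gene by Fitch parsimony.

    tree:     rooted binary tree as nested 2-tuples of leaf names.
    presence: dict mapping each leaf name to True (present) or False (absent).
    Returns a list of (event, node) pairs, where node is the leaf or
    subtree at the lower end of the branch on which the event is placed.
    """
    states = {}

    def up(node):
        # Bottom-up pass: collect the candidate states for every node.
        if isinstance(node, str):
            states[node] = {presence[node]}
        else:
            left, right = (up(child) for child in node)
            # Intersection if non-empty, otherwise union (Fitch rule).
            states[node] = (left & right) or (left | right)
        return states[node]

    events = []

    def down(node, parent_state):
        # Top-down pass: keep the parent state whenever it is allowed.
        candidates = states[node]
        state = parent_state if parent_state in candidates else next(iter(candidates))
        if state != parent_state:
            events.append(("gain" if state else "loss", node))
        if not isinstance(node, str):
            for child in node:
                down(child, state)

    up(tree)
    # Assumption: an ambiguous root is resolved as "absent".
    root_state = states[tree] == {True}
    down(tree, root_state)
    return events

# Hypothetical toy tree and patterns, for illustration only.
toy_tree = ((("sp1", "sp2"), "sp3"), "sp4")
gain_example = infer_events(
    toy_tree, {"sp1": True, "sp2": True, "sp3": False, "sp4": False})
loss_example = infer_events(
    toy_tree, {"sp1": True, "sp2": False, "sp3": True, "sp4": True})
```

In the first pattern the gene is present only in sp1 and sp2, so a single gain is placed on the branch leading to their common ancestor. In the second, a single loss is placed on the branch leading to sp2. Running the same function over every gene cluster and tallying events per branch gives a per-lineage summary of putative gains and losses.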

compared; similarly, three gene clusters are missing in these two
Spiroplasma species but are present in all four other species.

Finally, we found that all Spiroplasma genomes character-
ized to date have at least five copies of mreB (Ku et al. 2013),
which encodes the cell shape-determining protein MreB and
has been linked to the helical morphology of these bacteria
(Kürner et al. 2005). However, this gene is present as a single-
copy gene in the Mes. florum genome and was not found in
either of the Mycoplasma genomes. Because this gene was 
found in several Firmicutes genomes but not most of the 
Mollicutes genomes (Chen et al. 2012), it is possible that 
this gene was acquired by the common ancestor of spiroplas- 
mas (possibly through horizontal gene transfer). 
Subsequently, gene family expansion by duplication occurred 
and allowed for subfunctionalization (and possibly neofunc- 
tionalization) of different copies, which contributed to the 
distinct helical shape of spiroplasma cells. The losses of these 
genes in the common ancestor of the Mycoides- 
Entomoplasmataceae clade are likely to be responsible for 
the reversion to a nonhelical shape of these descendants
of spiroplasmas. 

Conclusions 

In summary, this study provides the first set of complete 
genome sequences for two Spiroplasma species in the Apis 
clade, which is the most diverse group within this genus. The 
conservation in chromosome organization suggests that these 
sequences may be used as the references for future genomic 
studies in related species. Through comparative analysis at 
different phylogenetic depths, we identified several genetic
mechanisms that may explain the results of previous pheno-
typic characterizations (metabolism, pathogenicity, etc.). For
future work, genomic characterizations and functional studies 
that include other mosquito-associated spiroplasmas can fur- 
ther improve our understanding of the diverse genetic mech- 
anisms of utilizing similar hosts among these phylogenetically 
distinct bacteria. Additionally, more comprehensive evalua- 
tions of the pathogenicity of each Spiroplasma species in dif- 
ferent mosquitoes, particularly the native hosts, are required 
to investigate bacterium-host interactions. At a deeper diver- 
gence level, genomic characterization of the basal Ixodetis 
clade is required to shed light on the genome evolution in 
the genus Spiroplasma and its nonhelical descendants. 

Supplementary Material 

Supplementary tables S1-S4 are available at Genome Biology 
and Evolution online (http://www.gbe.oxfordjournals.org/).